I upgraded a few months ago to Subversion 1.0, hoping that the database corruption problems I had seen repeatedly with the pre-1.0 series would have been ironed out. Unfortunately, not yet.
Having rebuilt all my repositories on a new server (the old one had disk corruption issues, which aggravated the problems with Subversion), I got the first corruption. Thankfully I have a copy-on-commit script, so although this new server was not backed up, at least the current state of the repository is retrievable.
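For anyone wondering what a copy-on-commit script is: it's just a post-commit hook that squirrels away each revision as it lands. A minimal sketch along those lines (the backup path here is an example, not my actual setup):

#!/bin/sh
# post-commit hook: Subversion passes the repository path and the new revision number.
REPOS="$1"
REV="$2"
# Dump just this revision as an incremental delta, compressed and stashed elsewhere.
svnadmin dump "$REPOS" --revision "$REV" --incremental | gzip > /var/backup/commit-"$REV".gz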
Personally I think the problem behind Subversion is a serious design flaw: depending on a single file (db/strings) to store the whole repository is, to put it politely, not particularly smart. Not only does it get huge too easily, but any corruption in the filesystem, or in that file, leaves you susceptible to losing the whole repository. Perhaps a per-directory strings file, and a database of directories, would help a bit - at least you would only lose a directory occasionally - but realistically CVS's per-file archive, plus a master log, would be a far better solution.
In conclusion
- Never run a Subversion server without a backup script! (You may as well be putting your data in a trash can.)
Comments
Well, I've been using Subversion for a couple of months now and I haven't had any problems at all. Actually it's been _much_ more stable than any other source control that I've used. Also, keeping any source control (or pretty much anything else) without a backup is a stupid thing to do.
What constitutes a copy-on-commit script? Are these problems documented elsewhere? I currently rely on Subversion and this is troublesome. Initially I found the arch syntax cumbersome, but perhaps I should switch to arch...
culley
Basically something like this in a cron job.
svnadmin dump /my/repos | gzip > /var/backup.$DATE.gz
Although it's not very efficient (eats disk space alive).
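As a crontab entry that might look something like this, assuming cron runs a POSIX shell (note the escaped % signs, which cron otherwise treats specially; the schedule and paths are just examples):

# Nightly at 3am: full dump of the repository, compressed and dated.
0 3 * * * DATE=$(date +\%Y\%m\%d); svnadmin dump /my/repos | gzip > /var/backup.$DATE.gz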
I looked into this a little more, and svnadmin has some facilities for creating backups (svnadmin hotcopy), and there is a Python wrapper around it too (hot-backup.py).
http://svnbook.red-bean.com/svnbook/ch05s03.html#svn-ch-5-sect-3.6
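In case it saves anyone a trip to the book, the basic invocation is just this (the destination path is only an example):

# Make a safe copy of the live repository, including the Berkeley DB files.
svnadmin hotcopy /my/repos /var/backup/repos-copy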
I feel pretty safe using this for my repository, which is quite small. This probably wouldn't be too practical for a big project like, say, postgresql...
culley
A single file is what I expect from a database. Maybe one file for the log and one file for the data, and hot backups, of course, of the database or the log. I don't understand why you would want many files, when one of the big changes from CVS to Subversion is the consistent commit: if I lost one file, it would end up out of sync with the rest of the code. Every source control server *must* be backed up in a consistent way. I don't think having the Berkeley database use a lot of files would help, because a database is even more sensitive when it comes to consistency. The decision to split the data across files is supposed to be a database (in this case, repository) administrator's task.